
Permutation Invariant Training of Deep Models for Speaker-Independent Multi-talker Speech Separation



Abstract

We propose a novel deep learning model, which supports permutation invariant training (PIT), for speaker-independent multi-talker speech separation, commonly known as the cocktail-party problem. Different from most of the prior arts that treat speech separation as a multi-class regression problem, and from the deep clustering technique that considers it a segmentation (or clustering) problem, our model optimizes for the separation regression error, ignoring the order of mixing sources. This strategy cleverly solves the long-lasting label permutation problem that has prevented progress on deep learning based techniques for speech separation. Experiments on the equal-energy mixing setup of a Danish corpus confirm the effectiveness of PIT. We believe improvements built upon PIT can eventually solve the cocktail-party problem and enable real-world adoption of, e.g., automatic meeting transcription and multi-party human-computer interaction, where overlapping speech is common.
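
At its core, PIT evaluates the separation regression error under every possible assignment of network outputs to reference sources and trains on the minimum, so the loss is unaffected by the arbitrary order of the mixed speakers. A minimal PyTorch sketch of that idea, assuming an MSE criterion and tensors shaped (batch, speakers, time); the function name and shapes are illustrative, not taken from the paper:

    import itertools
    import torch

    def pit_mse_loss(estimates: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # estimates, targets: (batch, num_speakers, time).
        # Compute the MSE under every speaker permutation and keep the
        # per-utterance minimum, so training ignores the source order.
        num_spk = estimates.shape[1]
        per_perm = []
        for perm in itertools.permutations(range(num_spk)):
            reordered = targets[:, list(perm), :]
            per_perm.append(((estimates - reordered) ** 2).mean(dim=(1, 2)))
        # (num_perms, batch) -> best assignment per utterance, then batch mean
        return torch.stack(per_perm).min(dim=0).values.mean()

For S speakers this scans all S! output-to-source assignments, which stays cheap for small speaker counts such as two- or three-talker mixtures.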
